The persistence of social signatures in human communication 
The social brain hypothesis has suggested that social network size (and structure) is constrained
by a combination of cognitive processes and the time required to service social relationships. We 
test this hypothesis in humans using a unique 18-month mobile phone dataset by examining changes 
in the structure of social networks across a major change in subjects' social and geographical cir- 
cumstances. Our analysis reveals that the time allocation patterns of call frequency by participants 
to network members have a distinctive overall shape, where a small number of top-ranked network 
members received a disproportionately large fraction of calls, with some individual variation. How- 
ever, importantly, whilst there was a large turnover of individual network members, these changes 
have little effect on the time allocation patterns of each individual: individuals thus displayed a 
distinctive "social signature" that was both persistent over time and independent of the identities of 
the network members. This provides the first direct evidence that social networks are constrained by 
a combination of cognitive constraints and the time individuals have available for social interaction, 
confirming one of the key assumptions of the social brain hypothesis. 
I. INTRODUCTION

The social brain hypothesis [1] suggests that there
are general time and cognitive constraints on the number 
of relationships an individual can maintain at particular 
levels of emotional intensity @, H[. Close, emotionally 
intense relationships appear to play an especially impor- 
tant functional role in social species like primates: having 
strong and supportive relationships is essential for health 
and wellbeing [5, 6] and is known to have an impact on 
females' fitness [7, 8]. Since time is inelastic and, at least
in humans, there is a direct relationship between the time 
devoted to a relationship and its emotional strength 
this can be expected to result in a trade-off between quantity
and quality of relationships. In addition, the social
brain hypothesis 0, @ and subsequent neuroimaging
studies that have tested this [10-13] suggest that there is
also a cognitive constraint on the number of relationships 
that an individual can maintain. These constraints result 
in a layered structure to personal networks, such that an 
individual can be envisaged as sitting in the centre of a 
series of concentric circles of acquaintanceship, with the 
relationships in these layers increasing in number but de- 
creasing in emotional intensity [1, @ . 

Although it has always been assumed that these layers 
are fixed in size, it has never formally been shown that 
individuals do not (or cannot) increase the number of 
close relationships (i.e. the number of individuals in any 
given network layer) when they form new relationships. 
Whilst there is often considerable instability in individual
social relationships [14], it has in fact been suggested
that there is a greater level of stability at the level of
personal networks - the set of ties an individual (ego) 
has to their family and friends (alters) [HI, [n| . We use 
a unique 18 month longitudinal dataset on humans that 
combines detailed data on communication patterns from 
mobile phone records with questionnaire data to explore 
changes in the personal networks of individuals under- 
going a major social transition: the move from school 
to university. We test the hypothesis that the number 
of close relationships and the time invested in them re- 
main invariant even when there is significant turnover in 
network membership. 

Our approach extends previous work in this area in 
three key ways. First, we have complete records of all 
calls an ego made to alters in their personal network over 
18 months (including calls to landline numbers), rather 
than just a subset of calls an ego made to alters who 
happened to be on the same mobile network as them, 
as has usually been the case in previous work. Second, 
by combining information from the phone records with 
questionnaire data, we are able to uncover the structure 
of personal networks, in terms of how the nature of social 
relationships relates to calling patterns. Third, we are 
able to determine the proportion of an ego's personal 
network captured by the phone records, as well as the 
characteristics of the alters present in personal networks 
but not present in the phone records. We rank all alters 
of an ego based on the number of calls they receive, and 
are thus able to establish each ego's distribution of time 
(quantified by mobile phone calls) over alter ranks, an 
ego's "social signature" . We then determine whether this 
social signature persists during a period of flux for social 
relationships with many alters both entering and leaving 
the network @, Q3, [18].
II. METHODS
A. Personal network survey and call records 

We used longitudinal data on the social networks of 
thirty participants (15 males and 15 females, aged 17 to 
19 years old: mean±SD age 18.1±0.48) in their last year 
of secondary school, collected over an 18-month period 
during the transition from school to university (for full 
details, see Roberts & Dunbar j9>]). Participants com- 
pleted a questionnaire on their active personal network 
at three points in time: at the beginning of the study ($t_1$),
at 9 months ($t_2$) and at 18 months ($t_3$). The analysis in
this study is based on the 24 participants (12 males, 12 
females) who completed all three questionnaires and fre- 
quently used their mobile phones throughout the study. 
To elicit their personal network, participants were asked 
to list all unrelated individuals " for whom you have con- 
tact details and with whom you consider that you have 
some kind of personal relationship (friend, acquaintance, 
someone you might interact with on a regular basis at 
school, work or university)" . The participants were also 
asked to list all their known relatives. For all individuals 
listed, participants were asked to provide both landline 
and mobile phone numbers. In each survey ($t_1$, $t_2$, $t_3$),
for both kin and friends/acquaintances, the participants 
were asked to indicate the emotional intensity of the rela- 
tionship by providing an emotional closeness score, mea- 
sured on a 1-10 scale, where 10 is someone "with whom 
you have a deeply personal relationship" . 

At $t_1$, all participants lived in the same large UK city
("City A"). At month 4 of the study, the participants
took their final exams at school ("A-levels") and left
the school. Of the 24 participants who completed all
three questionnaires, six participants stayed in City A
and worked, not going to university; eight went to university
in City A (which has two large universities) and
the remaining 10 went to university elsewhere in England.

In compensation for participating in the study, par- 
ticipants were given a mobile phone, with an 18-month 
contract from a major UK mobile telephone operator. 
The line rental for the mobile phone was paid for, and 
included 500 free monthly voice minutes (to landlines or 
mobiles) and unlimited free text messages. For each par- 
ticipant, we obtained itemized, electronic monthly phone 
invoices that listed all outgoing calls (recipient phone 
number, time and duration of calls). The electronic PDF 
invoices were parsed into machine-readable form. The 
questionnaire data and the call dataset form the main 
basis for our analysis. 



B. Constructing ego-centric call networks

For each participant in the study (ego), we used the
list of kin and friends/acquaintances (alters) generated in
response to the three social network questionnaires and
combined it with the electronic phone invoices to construct
a set of ego-centric call networks. If an alter was
listed as having multiple phone numbers, e.g. a mobile and a
fixed-line number, a call by the ego to either number was
recorded as a call between ego and alter. Phone numbers
appearing on the invoices but not listed in the questionnaire
responses were treated as unique alters; however,
service numbers (such as those with 0800 prefixes)
were filtered out. The 18-month observation period of
electronic phone invoices was divided into three consecutive
intervals of 6 months each ($I_1$: March-August, $I_2$:
September-February, $I_3$: March-August). For each ego in
each of the three intervals, we counted the total number
of his/her outgoing calls and the number of calls made
to each alter. Comparing the ego-alter relationships, as
reported by the egos via emotional closeness scores from
the survey data, with the egos' real calling behaviour, we
determined the fraction of self-reported ego relationships
appearing in the calling records. Using the call counts
per alter in each interval, we ranked each ego's alters from
most called to least called, calculated a time allocation
pattern (Zipf plot) depicting the fraction of calls to an
alter as a function of the alter's rank, and calculated
average emotional closeness as a function of alter's rank
for all 24 egos.

C. Comparison of ego-reported relationships to
phone call records

In most previous studies of human communication using
auto-recorded data d, [19-25] an alter appears in the
data only if there is communication between the ego and
alter. Thus, if communication occurs between ego and
alter via a channel not being studied (e.g. landline calls,
calls on other mobile networks to the one under investigation)
the alter is never known. Here we use the list
of alters, kin and friends/acquaintances, from the survey
data and the ego-reported emotional closeness score
for these alters to understand the characteristics of those
alters missing from the call pattern analysis.

Let us consider the calling behaviour of each ego towards
its alters of varying emotional closeness. Let
$A(g, c_{t_i}, I_i)$ be the set of alters of ego $g$ called in time
interval $I_i$ that were categorized in the survey at time $t_i$
with emotional closeness $c_{t_i}$. Similarly, let $L(g, c_{t_i}, I_i)$ be
the set of alters of specified emotional closeness during
time interval $I_i$ that were callable by ego $g$. An alter was
callable during time interval $I_i$ if the alter was first listed
in the survey data at $t_j$ or was in the set $A(g, c, I_j)$, where
$i \geq j$. The fraction of alters called by $g$ with emotional
closeness $c_{t_i}$ in time interval $I_i$ is simply

$$f(g, c_{t_i}, I_i) = \frac{|A(g, c_{t_i}, I_i)|}{|L(g, c_{t_i}, I_i)|}, \tag{1}$$

where numerator and denominator give the cardinality
of each set.
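
The fraction in Eq. (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: alters are represented by hypothetical string identifiers, each survey wave and call interval has already been reduced to a plain set or dictionary, and the names `callable_alters` and `fraction_called` are invented here rather than taken from the study.

```python
def callable_alters(listed_by_wave, called_by_interval, i):
    """Alters callable in interval i (0-based): listed in a survey wave j <= i
    or called in an interval j <= i (an assumed reading of the definition)."""
    pool = set()
    for j in range(i + 1):
        pool |= listed_by_wave[j] | called_by_interval[j]
    return pool


def fraction_called(listed_by_wave, called_by_interval, closeness_by_wave, i, c):
    """Eq. (1): |A| / |L| for alters scored with closeness c at survey wave i."""
    with_score = {a for a, s in closeness_by_wave[i].items() if s == c}
    callable_set = callable_alters(listed_by_wave, called_by_interval, i) & with_score
    called_set = called_by_interval[i] & with_score
    # No callable alter at this closeness level: the fraction is undefined.
    return len(called_set) / len(callable_set) if callable_set else float("nan")


# Hypothetical toy data for one ego over three survey waves / intervals.
listed = [{"mum", "ann", "bob"}, {"mum", "ann", "bob", "cat"}, {"mum", "ann", "cat"}]
closeness = [
    {"mum": 10, "ann": 8, "bob": 8},
    {"mum": 10, "ann": 8, "bob": 6, "cat": 8},
    {"mum": 10, "ann": 7, "cat": 9},
]
called = [{"mum", "ann"}, {"mum", "cat"}, {"mum", "cat"}]

f_first_interval = fraction_called(listed, called, closeness, 0, 8)
```

In this toy example two alters are scored 8 at the first survey and only one of them is called in the first interval, so the fraction is one half. Numbers that appear on invoices but not in the survey carry no closeness score and therefore drop out of both sets.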
D. Analyzing time allocation patterns

We quantify the variation between the sets of alters an
ego calls in two time intervals with the Jaccard coefficient,

$$J(I_i, I_j) = \frac{|A(I_i) \cap A(I_j)|}{|A(I_i) \cup A(I_j)|}, \tag{2}$$

where $A(I_i)$ and $A(I_j)$ are the sets of alters called by the
ego in two time intervals $I_i$ and $I_j$, respectively. Then
$J = 1$ if the sets are the same, and $J = 0$ if the sets
have no common alters. For a pairwise comparison of
the time allocation patterns between two different egos 
or two different time intervals for a single ego we measure 
the Jensen-Shannon divergence (JSD) (2(| defined as 



$$\mathrm{JSD}(P_1, P_2) = H\left(\tfrac{1}{2}P_1 + \tfrac{1}{2}P_2\right) - \tfrac{1}{2}\left[H(P_1) + H(P_2)\right], \tag{3}$$

where $P_1$ and $P_2$ are the two time allocation patterns,
with $P_i = \{p_i(r)\}$ such that $p_i(r)$ is the fraction of calls
to the alter of rank $r$ in pattern $i$. Additionally, $H(P)$ is
the Shannon entropy,

$$H(P) = -\sum_{r=1}^{k} p(r) \log p(r), \tag{4}$$

where $p(r)$ is as above and $k$ is the maximum rank,
i.e. the total number of alters called. The Jensen-Shannon
divergence is a generalized form of the Kullback-Leibler
divergence (KLD) such that $\mathrm{JSD}(P_1, P_2) \in [0, \infty)$,
and $\mathrm{JSD}(P_1, P_2) = 0$ iff the distributions are
identical. We chose JSD over KLD due to its capacity
to deal with zero probabilities $p(r) = 0$. The maximum
number of alters called by an ego in a given time interval,
$k$, varies depending on the ego and the interval; therefore,
if $k_1 > k_2$ is the larger number, we assign $p_2(r) = 0$ for
$k_1 \geq r > k_2$, i.e. zero-pad the series of fractions of calls
such that they are of the same length. Additionally, for
validating the pairwise comparison results, we also calculated
the $\ell_2$-norm for pairs of time allocation patterns,
defined as

$$\ell_2 = \sqrt{\sum_{r=1}^{k} |p_1(r) - p_2(r)|^2}.$$
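
The distance measures of this section, including the zero-padding step, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are invented here, the natural logarithm is assumed (the text does not state the base), and terms with $p(r) = 0$ are taken to contribute nothing to the entropy.

```python
import math


def zero_pad(p, q):
    """Extend the shorter pattern with zeros so both have the same length."""
    k = max(len(p), len(q))
    return list(p) + [0.0] * (k - len(p)), list(q) + [0.0] * (k - len(q))


def entropy(p):
    """Shannon entropy, Eq. (4); zero-probability ranks are skipped (0 log 0 = 0)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def jsd(p, q):
    """Jensen-Shannon divergence, Eq. (3), between two rank-ordered patterns."""
    p, q = zero_pad(p, q)
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mixture) - 0.5 * (entropy(p) + entropy(q))


def l2_distance(p, q):
    """Euclidean (l2) distance between two zero-padded patterns."""
    p, q = zero_pad(p, q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical patterns: fractions of calls by alter rank in two intervals.
pattern_a = [0.5, 0.3, 0.2]
pattern_b = [0.4, 0.3, 0.2, 0.1]

d_jsd = jsd(pattern_a, pattern_b)
d_l2 = l2_distance(pattern_a, pattern_b)
```

With natural logarithms the divergence computed this way cannot exceed $\log 2$, a value reached only when the two patterns put their weight on disjoint ranks. Properly ranked patterns always share rank 1, so observed values fall well below that bound.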



III. RESULTS 



A. Strength of ego-identified relationships and real
calling behaviour



The calls egos place to their alters are related to the 
strength of the ego-alter relationship, as measured by the 
ego-reported emotional closeness score. In plotting the
average fraction of alters an ego calls in a 6-month
time interval, $I_i$, as a function of the emotional closeness
score, $c_{t_i}$, we see that the fraction of alters called
increases with the alters' emotional closeness
score (Fig. 1, main panel). The small sample size does
result in large values for standard deviation, as shown by 
the shaded regions. Furthermore, we see that on aver- 
age an ego will place at least one call within a given 6- 
month time interval to four out of five alters that the ego 
scored with emotional closeness 8 or higher. Thus, the 
alters rated as most emotionally close to the ego are likely 
to appear as the most frequent contacts in auto-record 
phone data. We see that the phone data do not docu- 
ment every close ego-alter relationship; to fully capture 
an ego's interaction with all of its alters we would need to 
collect data on phone calls, emails, Facebook communications,
face-to-face interactions, etc., which would be a
daunting undertaking. Nonetheless, several studies have 
demonstrated that frequency of contact is a reliable index 
of emotional closeness in relationships [13, HI] , and these 
datasets confirm that frequency of contact by telephone
and other digital media (text, email) correlates significantly
with frequency of face-to-face contact ($p \ll 0.0001$
in each case, N=1006 and N=8967, respectively).

Moving beyond the binary accounting of whether or 
not an ego calls an alter given a particular emotional 
closeness and time interval, we calculate the average emo- 
tional closeness scores of alters by their ranked call fre- 
quency. The inset plot in Fig. [1] shows the average emo- 
tional closeness, and standard deviation via error bars, 
of the top 40 most called alters for all egos over all three 
time intervals. In the inset plot, we see that average emo- 
tional closeness decreases with increasing alter rank. It is 
not that alters with low emotional closeness scores are ex- 
cluded from the top ranks of the most called alters, but on 
average the most called alters do have higher emotional 
closeness scores than those alters called less frequently. 



B. Time allocation patterns and their persistence 

For almost all individuals in the survey, the time al- 
location patterns are characterized by a heavy tail that 
decreases slower than exponentially. A large fraction of 
communication is typically allocated to a small number 
of top-ranked alters: for male (female) participants, the 
fraction of calls to the top alter is on average 0.20 ± 0.09 
(0.26 ± 0.08), and the fraction of calls to the top three 
alters is 0.41 ± 0.12 (0.50 ± 0.11). A similar tendency to
communicate electronically mostly with only a few others
has been observed earlier for text messages [29], H3|
and Facebook [31]. This shape of the time allocation pattern
is in line with the layered network view, where the
innermost layer contains a small number of alters with 
close emotional ties that require large maintenance ef- 
fort. It should be noted that the call activity level of each 
participant varies a lot in time; on average, the partici- 
pants made 1030 ± 691 calls per 6-month interval. For 
some individuals, the difference between the maximum 
and minimum 6-month call counts was > 600 calls. 
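
A time allocation pattern of this kind can be derived from a raw call log as sketched below. The data layout (one alter identifier per outgoing call) and the function names are assumptions made for illustration only.

```python
from collections import Counter


def time_allocation_pattern(call_log):
    """Return (alter, fraction of calls) pairs ordered from most to least called."""
    counts = Counter(call_log)
    total = sum(counts.values())
    return [(alter, n / total) for alter, n in counts.most_common()]


def top_k_fraction(pattern, k):
    """Fraction of all calls that go to the k top-ranked alters."""
    return sum(fraction for _, fraction in pattern[:k])


# Hypothetical 6-month call log of one ego: one entry per outgoing call.
calls = ["mum"] * 5 + ["ann"] * 3 + ["bob"] * 2

pattern = time_allocation_pattern(calls)
top_one = top_k_fraction(pattern, 1)
top_three = top_k_fraction(pattern, 3)
```

Because the pattern is normalised by the ego's total call count in the interval, it stays comparable across intervals even when the activity level changes by hundreds of calls.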

Figure 2 shows the time allocation patterns for two
specific egos for each of the three time intervals (a male
whose pattern deviates from the average, and a female
with a pattern close to average), together with the pattern
averaged over all 24 egos. The ego whose patterns
are depicted in the upper row (panels a to c) is a male
who went to university in another city, and the lower row
(panels d to f) represents a female who went to university
in City A. For the upper row, the top-ranking alters
receive a very large fraction of calls and persistently
include two family members (triangles), whereas for the
networks in the lower row, the top alters are less dominant,
kin are ranked lower, and kin display larger rank
fluctuations.

FIG. 1: Relationship between call pattern and emotional closeness scores attributed to alters. The main figure illustrates the
fraction of alters, averaged over all egos, $\langle f(g, c_{t_i}, I_i) \rangle_g$, that are actually called by an ego in a 6-month period, $I_i$, given that the
ego scores the alter with emotional closeness $c_{t_i}$ in the survey at time $t_i$. The shaded region indicates the standard deviation.
The inset shows the average emotional closeness of alters of varying rank with error bars showing the standard deviation.

FIG. 2: Time allocation patterns for two different egos (survey participants) (top and bottom rows), displaying the fraction of
calls to each alter called as a function of alter rank, for the three 6-month time intervals (columns). The symbols correspond
to alters observed for the first time in intervals $I_1$ (circles), $I_2$ (squares), and $I_3$ (diamonds), or to kin (triangles) as reported
by the egos. The dashed line indicates the time allocation pattern averaged over all 24 egos.

FIG. 3: Persistence of time allocation patterns. a) A schematic of how the distances based on Jensen-Shannon divergences are
calculated. For the focal ego (top row), self-distances ($d_{\mathrm{self}}$) are calculated for patterns in consecutive intervals and averaged.
Reference distances ($d_{\mathrm{ref}}$) are calculated for each interval between the patterns of the focal ego and all other egos (bottom row).
These are averaged over the three intervals for each pair of egos (focal, other). b) Values of the average self-distances ($d_{\mathrm{self}}$)
and histograms for reference distances ($d_{\mathrm{ref}}$) for four example egos. c) Distributions of self-distances and reference distances,
for all egos.

It is also clear on the basis of Figure 2 that the alter
composition of the networks undergoes major changes.
For both egos shown here, the networks corresponding
to the second 6-month interval ($I_2$) are dominated by
newcomers, i.e. alters that were first observed in $I_2$.
This reflects the period of change that the egos are going
through: $I_2$ represents the first six months of the
first academic year for those participants who went to
university. Overall, as quantified by the Jaccard coefficient,
the similarities between the sets of alters in
consecutive intervals, averaged over all respondents, are
$J(I_1, I_2) = 0.20 \pm 0.08$ and $J(I_2, I_3) = 0.26 \pm 0.09$ for
the full set of alters. Thus there is more turnover between
intervals $I_1$ and $I_2$ (two-sample unequal variance
t-test: $t = 2.126$, $p = 0.039$). However, if we only consider
the top 20 ranking alters, the similarities are higher:
$J(I_1, I_2) = 0.34 \pm 0.12$ and $J(I_2, I_3) = 0.44 \pm 0.10$ (the
means are different with $t = 2.961$, $p = 0.005$). Nonetheless,
it is clear that the variation is not solely due to high
turnover in the lowest ranks.
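
The turnover comparison above, for the full set of alters and for the top-ranked alters only, can be sketched as follows. The alter names and the convention for two empty sets are assumptions made for illustration, and the toy lists are far shorter than real networks, so the top-2 alters stand in for the top 20.

```python
def jaccard(a, b):
    """Jaccard coefficient of two collections of alters, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    # Convention chosen here: two empty sets count as identical.
    return len(a & b) / len(union) if union else 1.0


def top_n_jaccard(ranked_a, ranked_b, n):
    """Jaccard coefficient restricted to the n most-called alters of each interval."""
    return jaccard(ranked_a[:n], ranked_b[:n])


# Hypothetical alters of one ego, ordered from most to least called.
ranked_first = ["mum", "ann", "bob", "cat"]
ranked_second = ["mum", "ann", "dan", "eve", "fay"]

j_full = jaccard(ranked_first, ranked_second)
j_top2 = top_n_jaccard(ranked_first, ranked_second, 2)
```

In this toy case the two top-ranked alters are unchanged while the lower ranks are replaced entirely, so the restricted coefficient is much higher than the one for the full sets.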

In order to measure the changes in the time allocation
patterns over time, we apply the Jensen-Shannon
divergence as a measure of the distance between patterns.
In order to quantify how similar an individual's patterns
for consecutive windows are, we calculated i) the
distances between one ego's patterns for consecutive windows,
and ii) the averaged distances between the patterns
for the focal ego and all other egos within an interval
(see Fig. 3a). We then calculated self and reference
distances $d_{\mathrm{self}}$ and $d_{\mathrm{ref}}$ such that $d_{\mathrm{self}}$ was
averaged over the two distances between consecutive windows,
$d^{i}_{\mathrm{self}} = \frac{1}{2}\left(d^{ii}_{12} + d^{ii}_{23}\right)$, where $i$ indicates the focal
ego and sub-indices denote time intervals. Reference distances
were averaged for each pair of egos over the three
time windows, $d^{ij}_{\mathrm{ref}} = \frac{1}{3}\left(d^{ij}_{11} + d^{ij}_{22} + d^{ij}_{33}\right)$, where $j$
denotes non-focal egos.
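
The self and reference distances can be sketched as below, assuming each ego is represented by three rank-ordered lists of call fractions, one per interval. The helper names, the toy patterns, and the use of natural logarithms are illustrative assumptions rather than details taken from the study.

```python
import math


def entropy(p):
    """Shannon entropy with the convention 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two zero-padded patterns."""
    k = max(len(p), len(q))
    p = list(p) + [0.0] * (k - len(p))
    q = list(q) + [0.0] * (k - len(q))
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mixture) - 0.5 * (entropy(p) + entropy(q))


def self_distance(patterns):
    """Average distance between one ego's patterns in consecutive intervals."""
    first, second, third = patterns
    return 0.5 * (jsd(first, second) + jsd(second, third))


def reference_distance(focal, other):
    """Average distance between two egos' patterns within the same interval."""
    return sum(jsd(f, o) for f, o in zip(focal, other)) / len(focal)


def share_of_larger_references(focal, others):
    """Fraction of reference distances that exceed the focal ego's self-distance."""
    d_self = self_distance(focal)
    larger = [reference_distance(focal, o) > d_self for o in others]
    return sum(larger) / len(larger)


# Hypothetical egos: three patterns each (one per 6-month interval).
ego_a = [[0.6, 0.3, 0.1], [0.6, 0.3, 0.1], [0.55, 0.35, 0.1]]
ego_b = [[0.25, 0.25, 0.25, 0.25]] * 3

d_self_a = self_distance(ego_a)
d_ref_ab = reference_distance(ego_a, ego_b)
```

A persistent social signature corresponds to a self-distance that is small relative to the reference distances, as in this toy pair where the first ego concentrates calls on one alter and the second spreads them evenly.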



The results in Figure 3 (panels b and c) clearly indicate
that on average, the shapes of the time allocation
patterns of participants (the social signatures) show
a tendency to persist in time, as the distance values
$d_{\mathrm{self}}$ between one participant's consecutive patterns are
on average much lower than the distances $d_{\mathrm{ref}}$ to other
participants. On average, for each ego, 80% ± 13% of
the distances to others were greater than $d_{\mathrm{self}}$. Averaged
over all egos, the average self-distance was
$\langle d_{\mathrm{self}} \rangle = 0.037 \pm 0.015$ while the average distance to
other egos was $\langle d_{\mathrm{ref}} \rangle = 0.087 \pm 0.044$
($\langle d_{\mathrm{self}} \rangle < \langle d_{\mathrm{ref}} \rangle$ with $t = 13.3$,
$p \ll 10^{-6}$, two-sample unequal variance t-test). Using
the $\ell_2$-norm as an alternative distance measure yields
a qualitatively similar outcome ($\langle d_{\mathrm{self}} \rangle = 0.10 \pm 0.04$,
$\langle d_{\mathrm{ref}} \rangle = 0.16 \pm 0.09$, $\langle d_{\mathrm{self}} \rangle < \langle d_{\mathrm{ref}} \rangle$
with $t = 6.04$, $p < 10^{-6}$).



IV. DISCUSSION 

In this study, we used a unique longitudinal dataset, 
combining detailed mobile phone call records with three 
waves of survey data, to examine the personal networks 
of participants during a period of natural flux in their so- 
cial relationships. Our key findings can be summarized as 
follows: (1) There is a clear relationship between the emo- 
tional intensity of alters and the frequency of calls made 
to them. (2) The frequency of calls to alters is broadly 
similar across all individuals, with a small number of top- 
ranked alters receiving a disproportionately large fraction 
of calls. However, we also observe considerable hetero- 
geneity in the detailed pattern of how different individ- 
uals allocate time to their alters. (3) Although network 
compositions undergo major changes, with many alters
entering and leaving a network and relationships increas- 
ing and decreasing in intensity, these changes are seen 
to have surprisingly small effects on the time allocation 
patterns. Thus, individuals appear to have a "social sig- 
nature" in that they allocate roughly the same amount 
of time to their alters depending on their rank, indepen- 
dent of who these alters are. Such signature patterns 
show variation between participants but appear persis- 
tent over time for each participant. This provides the 
first direct evidence for the claim [l|, 0] that social net- 
works are constrained in some way either by cognition 
or by the time individuals have available for social in- 
teraction, or both: when new relationships are acquired, 
old ones are inevitably downgraded. As such, this con- 
firms one key assumption underpinning the social brain 
hypothesis that is thought to be responsible for the lay- 
ering of social networks. 

When reflected against the prediction of a layered 
structure of personal networks from the social brain hy- 
pothesis, our observations can broadly speaking be con- 
sidered in accordance with the main characteristics where 
the networks comprise a small number of relationships of 
high emotional intensity, with increasing numbers of relationships
of lower emotional intensity. Discrete layer
boundaries where the communication frequency drops
abruptly were observed for some egos (see Fig. 3, top
row) within some of the time intervals; however, there



was a lot of individual variation. This is to be expected: 
while emotional intensity was seen to correlate with call 
frequency, calls are only one of the possible communica- 
tion modalities, and social interactions carried out, e.g., 
by face-to-face contacts were not included in our analysis. 

More broadly, our approach shows the value of combining
subjective survey data (e.g. on the emotional
intensity of relationships) with the digital traces of
electronically-mediated communication. Both of these 
sources of data have their limitations, but by combining 
the two, important insights can be gained about how the 
objective pattern of communication relates to the nature 
of our social relationships (e.g. [32]). Future work could
use this combined data to further our understanding of 
how patterns of communication relate to specific types 
of social tie. For example, if there are clear differences 
in the patterns of mobile communication between family 
members, and communication between friends, it might 
be possible to use these to infer social relationships, based 
solely on communication patterns, from mobile datasets 
where information on the nature of the social interactions 
is lacking. 